Factors Affecting the Accuracy of Korean Parsing

Authors

  • Tagyoung Chung
  • Matt Post
  • Daniel Gildea
Abstract

We investigate parsing accuracy on the Korean Treebank 2.0 with a number of different grammars. Comparisons among these grammars and to their English counterparts suggest different aspects of Korean that contribute to parsing difficulty. Our results indicate that the coarseness of the Treebank’s nonterminal set is an even greater problem than in the English Treebank. We also find that Korean’s relatively free word order does not impact parsing results as much as one might expect, but in fact the prevalence of zero pronouns accounts for a large portion of the difference between Korean and English parsing scores.

Similar resources

The Effect of Morphology on Persian Dependency Parsing

Data-driven systems can be adapted to different languages and domains easily. Applying this trend to dependency parsing has led to the introduction of data-driven approaches. The existence of appropriate corpora that contain sentences and their associated dependency trees is the only prerequisite for data-driven approaches. Despite obtaining highly accurate results for the dependency parsing task in English langu...

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

Probabilistic Parsing of Korean Sentences Using Collocational Information

Lexical information is one of the most important sources that can improve the accuracy of syntactic disambiguation. This paper describes a Korean probabilistic parser that is based on the probabilities of phrase structure rules as well as the probabilities of collocational information between lexical items to resolve syntactic ambiguity. The proposed parser is shown by means of an extensive...

A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of Persian

In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...

Probabilistic Language Model for Analyzing Korean Sentences

In this paper, we introduce a restricted form of phrase structure grammar to handle the characteristics of Korean more efficiently. Based on this restricted form of the grammar, we propose a probabilistic parser for Korean sentences. To show the usefulness of the parser proposed in this paper, we ran a preliminary experiment. We extract a set of rules from about 1,682 tree-annotated sentences. The e...

Publication date: 2010